ABSTRACT

The vascular endothelial growth factor (VEGF) proximal
promoter region contains a poly G/C-rich element 
that is essential for basal and inducible VEGF 
expression. The guanine-rich strand on this tract 
has been shown to form the DNA G-quadruplex 
structure, whose stabilization by small molecules 
can suppress VEGF expression. We report here the 
nuclear magnetic resonance structure of the major 
intramolecular G-quadruplex formed in this region 
in K+ solution using the 22-mer VEGF promoter
sequence with G-to-T mutations of two loop
residues. Our results have unambiguously
demonstrated that the major G-quadruplex formed
in the VEGF promoter in K+ solution is a parallel-stranded
structure with a 1:4:1 loop-size arrangement.
A unique capping structure was shown to
form in this 1:4:1 G-quadruplex. Parallel-stranded 
G-quadruplexes are commonly found in the human 
promoter sequences. The nuclear magnetic resonance
structure of the major VEGF G-quadruplex
shows that the 4-nt middle loop plays a central
role in forming the specific capping structure and in
stabilizing the most favored folding pattern. It is
thus suggested that each parallel G-quadruplex
likely adopts unique capping and loop structures
through its specific middle loop and flanking
segments, which together determine the overall
structure and the specific recognition sites for small
molecules or proteins.

LAY SUMMARY: The human VEGF is a key regulator
of angiogenesis and plays an important role in
tumor survival, growth and metastasis. VEGF
overexpression is frequently found in a wide range
of human tumors; the VEGF pathway has become
an attractive target for cancer therapeutics. DNA 
G-quadruplexes have been shown to form in the 
proximal promoter region of VEGF and are 
amenable to small molecule drug targeting for 
VEGF suppression. The detailed molecular structure 
of the major VEGF promoter G-quadruplex reported 
here will provide an important basis for structure- 
based rational development of small molecule 
drugs targeting the VEGF G-quadruplex for gene 
suppression. 

INTRODUCTION 

The human vascular endothelial growth factor (VEGF) is 
a pluripotent cytokine and a key regulator of angiogen- 
esis. VEGF plays an important role in tumor survival, 
growth and metastasis (1,2). It binds to VEGF receptors 
on the surfaces of endothelial cells to promote the forma- 
tion of new blood vessels, or angiogenesis, which can 
promote tumor growth by providing oxygen and nutrients 
as well as provide escape routes for disseminating tumor 
cells (3,4). VEGF overexpression is frequently found in a 
wide range of human tumors (5-8) and can be induced by 
the loss or inactivation of tumor suppressor genes (9), the 
activation of oncogenes (10), external stimuli such as 
hypoxia and cytokines (11,12) and transcriptional 
upregulation, which is controlled by the cis-acting
elements and transcription factors (5-9,13). Anti-VEGF 
therapy has been actively pursued for cancer therapeutics 
in a variety of forms, including antibodies, ribozymes, 
immunotoxins and small molecule inhibitors (14-23). 

The G-quadruplexes formed in oncogene promoters 
have been shown to be potential targets for small 
molecule drugs (24-26). Most recently, the existence of
DNA G-quadruplexes has been visualized on chromosomes
in human cells using a G-quadruplex-specific antibody
(27). One region proximal to the transcription initiation
site, a 39-bp polyG/polyC region located -88 to -50 bp
relative to the transcription initiation site, has been shown
to be functionally significant in VEGF transcriptional
activity with multiple transcription factor binding sites,
including three potential Sp1 binding sites (13). This
region has been shown to be highly dynamic in conform- 
ation and can form DNA G-quadruplex secondary struc- 
ture on the G-rich strand, as demonstrated by in vitro and 
plasmid footprinting with dimethyl sulfate (DMS), 
DNase I and S1 nuclease in K+ (28,29), and by in vivo
DMS footprinting using A498 kidney cancer cells 
that overexpress VEGF (30). The formation of 
DNA G-quadruplex structure is clearly enhanced by 
G-quadruplex-interactive agents (28), which can repress 
VEGF expression in human tumor cells (31), suggesting 
that the VEGF G-quadruplex is amenable to small 
molecule drug targeting for VEGF suppression. A 
detailed molecular structure of the major VEGF 
promoter G-quadruplex will be important for structure- 
based rational development of small molecule drugs (32). 

We report here the nuclear magnetic resonance (NMR) 
structure of the major G-quadruplex formed in the human 
VEGF promoter in K+ solution. Our NMR study unequivocally
demonstrated that the major intramolecular
G-quadruplex formed in the VEGF promoter in K+ is a
parallel-stranded structure with a 1:4:1 loop-size arrangement.
We have found that the middle 4-nt loop interacts
with the 5'-flanking residues to form a specific capping
structure, a salient feature because this interaction is specific
to the VEGF sequence and differs from those of other
parallel-stranded structures. Together with the 5'-flanking
segment, the 4-nt middle loop appears to play a
central role in forming the specific capping structure that 
likely determines this most favored folding pattern. 
Parallel-stranded G-quadruplexes have been found to be 
common in the human promoter sequences. Significantly, 
our results indicate that each parallel structure is likely to
adopt unique capping and loop structures through its specific
flanking sequences and middle loop, which together determine
the stability of the overall G-quadruplex structure
and potential specific interactions with small molecules or 
proteins. 



MATERIALS AND METHODS 

The synthesis and purification of DNA oligonucleotides
were performed as described earlier (33-37). Water samples were
prepared in 90%/10% H2O/D2O solution. Samples in
D2O were prepared by repeated lyophilization and final
dissolution in 99.96% D2O. The final NMR samples contained
0.1-2.5 mM DNA in 25 mM K-phosphate buffer
(pH 7.0) and 70 mM KCl.

Circular dichroism (CD) spectroscopic study of the 
oligonucleotides was performed on a Jasco J-810 
spectropolarimeter (Jasco Inc., Easton, MD, USA) 
equipped with a thermoelectrically controlled cell holder 
as described previously (38). A quartz cell of 1 mm
optical path length was used. A blank sample containing
only buffer was used for the baseline correction. CD spectroscopic
measurements were the averages of three scans
collected between 200 and 350 nm. The scanning speed of
the CD instrument was 100 nm/min, and the response time
was 1 s. Tm values were measured by CD melting and
annealing experiments performed at 265 nm for three
repeats, with a heating or cooling rate of 2°C/min,
respectively.

NMR experiments were performed on a Bruker DRX-600 MHz
spectrometer as described earlier (33-37).
Stoichiometric titration of the unfolded and folded
strands as a function of total strand concentration from
0.01 to 0.1 mM was performed at 75°C (melting point)
(39). The guanine H1 imino protons, one-bond coupled
to N1, and H8 protons, two-bond coupled to N7, can be
unambiguously assigned by 1D 15N-edited heteronuclear
multiple quantum coherence (HMQC) experiments (40).
For this purpose, site-specific labeled DNA synthesis
with 6% 15N-labeled guanine phosphoramidite (41) was
used. The 1D GE-JRSE HMQC experiments were used
for measuring 15N-edited spectra (40) to identify guanine
imino and H8 protons. The 1D variable temperature (VT)
proton NMR experiments were done in the range from
1°C to 80°C. Homonuclear 2D-NMR experiments, namely
double quantum filtered-correlation spectroscopy (DQF-COSY),
total correlation spectroscopy (TOCSY) and
nuclear Overhauser effect spectroscopy (NOESY), were
collected at 5, 15 and 25°C for complete proton resonance
assignment in water and D2O solution. The contribution
from J-modulation and zero quantum coherence effects
was suppressed by using a z-gradient filter with a gradient
strength of 20% and a duration of 1 ms. The NMR experiments
for samples in water were performed using the jump-return
spin-echo water suppression technique, in which the
water peak was suppressed with the maximum intensity
tuned to 11 ppm (42). Relaxation delays were set to 2.5 s.
The acquisition data points were set to 4096 × 512
(complex points). Peak assignments and integrations
were achieved using the software Sparky (UCSF).
Distances between non-exchangeable protons were estimated based on the
nuclear Overhauser effect (NOE) cross-peak volumes at
50-300 ms mixing times, with the upper and lower
boundaries assigned to ±20% of the estimated distances.
Distance restraints for the unresolved cross-peaks were set
with looser boundaries of ±30%. The cytosine base
proton H6-H5 distance (2.45 Å) was used as a reference
distance. The distances involving the unresolved protons,
e.g. methyl protons, were assigned using pseudo-atom
notation to make use of the pseudo-atom correction automatically
computed by X-PLOR.
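
The distance calibration described above can be illustrated with a minimal sketch. It assumes the standard isolated spin-pair approximation, in which the cross-peak volume scales as r^-6; the text does not state the exact calibration formula used, and the function names and peak volumes below are hypothetical.

```python
def noe_distance(volume, ref_volume, ref_distance=2.45):
    """Estimate an interproton distance (in angstroms) from a NOE volume.

    Assumes the isolated spin-pair approximation (volume proportional to
    r**-6), calibrated on a reference peak such as the cytosine H5-H6
    pair at 2.45 angstroms.
    """
    if volume <= 0 or ref_volume <= 0:
        raise ValueError("cross-peak volumes must be positive")
    return ref_distance * (ref_volume / volume) ** (1.0 / 6.0)


def restraint_bounds(distance, tolerance=0.20):
    """Return (lower, upper) bounds at +/- tolerance of the distance."""
    return distance * (1.0 - tolerance), distance * (1.0 + tolerance)


# Hypothetical volumes in arbitrary units, for illustration only.
ref_volume = 64.0
peak_volume = 1.0

d = noe_distance(peak_volume, ref_volume)     # about 4.90 angstroms
lower, upper = restraint_bounds(d)            # resolved peak: +/-20%
loose = restraint_bounds(d, tolerance=0.30)   # unresolved peak: +/-30%
print(round(d, 2), round(lower, 2), round(upper, 2))
```

Because of the sixth-root dependence, a 64-fold weaker peak corresponds to only a twofold longer distance, which is why fairly generous bounds are tolerable.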

The structure of Pu22-T12T13 was calculated using
X-PLOR (43). Metric matrix distance geometry and
simulated annealing calculations were carried out in
X-PLOR (43) to embed and optimize 100 initial structures
based on an arbitrary extended conformation of the
single-stranded Pu22-T12T13 sequence to produce a
family of 100 DG structures, as described previously 
(33,34). The experimentally obtained distance restraints 
and G-tetrad hydrogen bonding distance restraints were 
included during the calculations. All of the 100 molecules 
obtained from the distance geometry simulated annealing
(DGSA) calculations were subjected to NOE-restrained
simulated annealing refinement in X-PLOR (43) with a
distance-dependent dielectric constant. A total of 407
NOE distance restraints were introduced into the NOE-restrained
structure calculation with a force constant of
20 kcal mol⁻¹ Å⁻². Hydrogen bond restraints were applied
to the G-tetrads, using a quadratic energy function with a
force constant of 100 kcal mol⁻¹ Å⁻². A low-level
planarity restraint (2 kcal mol⁻¹ Å⁻²) was also applied
on the G-tetrads in the simulated annealing step of the
structure calculation. The planarity restraints were
removed in the final molecular dynamics simulation with
energy minimization. Dihedral angle restraints were used
to restrict the glycosidic torsion angle for the experimentally
assigned anti conformation bases and for tetrad
guanines. The 30 best molecules were selected based
both on their minimal energy terms and number of
NOE violations and were further subjected to NOE-restrained
molecular dynamics calculations at 300 K for 25
ps. The coordinates saved every 0.1 ps during the last
2.0 ps of NOE-restrained molecular dynamics calculations
were averaged, and the resulting averaged structure was
subjected to further minimization until an energy gradient
of 0.1 kcal mol⁻¹ was achieved. The 10 best molecules
were selected based both on their minimal energy terms
and number of NOE violations; the mean rms deviation
for this family of 10 structures is 1.10 Å.
For the G-quadruplex formed in the wild-type
VEGF-Pu22 sequence, we took the G-quadruplex
formed in Pu22-T12T13 as the starting structure and 
replaced T12 and T13 with the wild-type G12 and G13 
residues. This structure was then subjected to energy mini- 
mization followed by unrestrained molecular dynamics 
simulation for 25 ps at 300 K. 
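
As a rough illustration of how the force constants quoted above weight restraint violations, the sketch below uses a generic flat-bottomed quadratic penalty. This functional form is an assumption made for illustration; the text does not specify which X-PLOR restraint potential was used, and the example bounds and distances are hypothetical.

```python
def restraint_energy(distance, lower, upper, force_constant):
    """Flat-bottomed quadratic penalty (kcal/mol) for a distance restraint.

    The energy is zero inside [lower, upper] and grows as
    force_constant * violation**2 outside. This is an assumed, generic
    form and not necessarily the potential used in the original work.
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if distance < lower:
        violation = lower - distance
    elif distance > upper:
        violation = distance - upper
    else:
        violation = 0.0
    return force_constant * violation ** 2


NOE_K = 20.0     # kcal mol^-1 A^-2, NOE restraints (from the text)
HBOND_K = 100.0  # kcal mol^-1 A^-2, G-tetrad hydrogen bonds (from the text)

# Hypothetical restraint with bounds 3.92-5.88 A.
e_inside = restraint_energy(5.00, 3.92, 5.88, NOE_K)   # satisfied restraint
e_outside = restraint_energy(6.38, 3.92, 5.88, NOE_K)  # 0.5 A violation
e_hbond = restraint_energy(6.38, 3.92, 5.88, HBOND_K)  # same violation, stiffer
print(e_inside, round(e_outside, 2), round(e_hbond, 2))
```

Under this assumed form, the same 0.5 Å violation costs five times more energy for a hydrogen-bond restraint than for an NOE restraint, reflecting the ratio of the two force constants.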

RESULTS 

The major G-quadruplex formed in the VEGF promoter in
K+ solution adopts a parallel-stranded structure with a 1:4:1
loop-size arrangement

The G-rich strand of this VEGF proximal promoter
region contains five guanine-runs. Using electrophoretic
mobility shift assay (EMSA), DMS footprinting and
DNA polymerase stop assay in K+ solution, it has been
shown that the G-quadruplex formed in this region
involves only the four successive G-runs at the 5' end (VEGF-Pu22,
Figure 1A) (29,31), which contain four (G2-G5), three
(G7-G9), five (G12-G16) and four (G18-G21) guanines,
respectively. VEGF-Pu22 can form multiple loop
isomers. The wild-type VEGF-Pu22 forms a clear predominant
G-quadruplex structure in 95 mM K+ solution, as
shown by a set of imino proton peaks at 10.5-12 ppm in
1H NMR, characteristic of G-tetrad guanines (Figure 1B).
The CD spectrum of VEGF-Pu22 showed a positive peak
at ~265 nm and a negative peak at 240 nm (Supplementary
Figure S1), characteristic of a parallel-stranded
G-quadruplex structure (38). We prepared the wild-type
VEGF-Pu22 sequence with 6% site-specific incorporation
of 15N-labeled guanine at each guanine of the five-guanine run
G12-G16 (Figure 1A). The imino protons of G14, G15



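[Figure 1, panel A: sequences, with residues numbered 1-22 from the 5' end; G-runs I-IV of VEGF-Pu22 are G2-G5, G7-G9, G12-G16 and G18-G21]

VEGF-Pu22          5'-CGGGGCGGGCCGGGGGCGGGGT-3'
Pu22-T12T13        5'-CGGGGCGGGCCTTGGGCGGGGT-3'
Pu22-T12T13A2A21   5'-CAGGGCGGGCCTTGGGCGGGAT-3'

[Figure 1, panels B and C: imino regions of the 1D 1H NMR spectra, 11.0-12.0 ppm; panels D and E: schematic drawings]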

Figure 1. (A) The promoter sequence of VEGF and its modifications. 
VEGF-Pu22 is the 22mer wild-type G-rich sequence needed for 
quadruplex formation; the four G-runs are numbered. Pu22-T12T13 
and Pu22-T12T13A2A21 are modified Pu22 sequences with mutations 
shown in cyan. Pu22-T12T13 and Pu22-T12T13A2A21 adopt the major 
1:4:1 parallel-stranded structure investigated in this study. The numbering 
system is shown above VEGF-Pu22. (B) The imino region of 1D 1H NMR
spectra of the wild-type VEGF-Pu22 and Pu22-T12T13. (C) The imino
region of 1D 1H NMR spectra of the wild-type VEGF-Pu22. Imino
proton assignments of G12-G16 using 1D 15N-edited HMQC on
site-specific-labeled VEGF-Pu22 at each of G12-G16 are also shown.
Conditions: 25 mM K-phosphate, 70 mM KCl (pH 7.0), 25°C.
(D) Schematic drawing of the major 1:4:1 G-quadruplex formed in
VEGF-Pu22 (G = red, C = yellow, T = blue). (E) A G-tetrad with H1-H1
and H1-H8 connectivity pattern detectable in NOESY experiments.



and G16 were clearly detected in 1D 15N-edited HMQC
experiments, whereas the imino proton of G12 was weak
and the imino proton of G13 was missing (Figure 1C); the
imino proton of G13 was not detected even at 2°C
(Supplementary Figure S2), indicating that the major conformation
formed in the wild-type VEGF-Pu22 does not
involve G12 and G13 in the G-tetrad formation. Thus, the
folding topology of the major G-quadruplex formed in
VEGF-Pu22 is a parallel G-quadruplex with a 1:4:1
loop-size arrangement (Figure 1D). This major VEGF
G-quadruplex can be isolated by the sequence Pu22-T12T13,
with G-to-T mutations at positions 12 and 13
(Figure 1A). Pu22-T12T13 gave rise to a well-resolved
1H NMR spectrum in 95 mM K+ solution (Figure 1B)
and was used for NMR structure determination.
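
The residue numbering of the G-runs and of the mutant sequences can be checked with a short script. This is only an illustrative sketch: the helper names `g_runs` and `mutate` are hypothetical, and the sequences are those given in Figure 1A.

```python
import re

# Wild-type 22-mer, 5' to 3', residues numbered 1-22 (Figure 1A).
VEGF_PU22 = "CGGGGCGGGCCGGGGGCGGGGT"


def g_runs(sequence, min_length=3):
    """Return (start, end) positions, 1-based and inclusive, of each G-run."""
    pattern = "G{%d,}" % min_length
    return [(m.start() + 1, m.end()) for m in re.finditer(pattern, sequence)]


def mutate(sequence, substitutions):
    """Apply {position: base} substitutions, using 1-based positions."""
    bases = list(sequence)
    for position, base in substitutions.items():
        bases[position - 1] = base
    return "".join(bases)


runs = g_runs(VEGF_PU22)
pu22_t12t13 = mutate(VEGF_PU22, {12: "T", 13: "T"})
pu22_t12t13a2a21 = mutate(VEGF_PU22, {2: "A", 12: "T", 13: "T", 21: "A"})

print(runs)
print(pu22_t12t13, g_runs(pu22_t12t13))
print(pu22_t12t13a2a21)
```

In Pu22-T12T13 the third G-run is reduced to G14-G16, so the four residues between the second and third runs (C10-C11-T12-T13) can only form the 4-nt middle loop.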

To determine the effect of loop and flanking residues, 
we have tested various modified VEGF sequences by 1H
NMR (Figure 1B and Supplementary Figure S3). The
spectrum of Pu22-T12 with G12-to-T mutation is almost 
the same as that of the wild-type VEGF-Pu22, indicating 
that G12 is involved in neither the tetrad formation nor 
the capping structure. The spectrum of Pu22-T12T13 is 
similar to that of the wild-type VEGF-Pu22, with the 
G7 imino proton down-field shifted, likely due to a 
smaller ring-current effect of T13 than that of G13 in 
the capping structure (see later in the text). The 
spectrum of Pu22-T12T13A2 showed a shifted G18 
imino proton, likely caused by a different base pair con- 
formation (T13:A2) of this modified sequence, whereas 
Pu22-T12T13A2A21 showed additionally shifted G20 
and G16 imino protons, likely due to the mutated A21 
base. 

The less stable 1:2:3 loop isomer can also be isolated in a
modified VEGF sequence in K+ solution

Our result is consistent with the previous DMS 
footprinting data, which show that the 1:4:1 loop isomer 
is the predominant G-quadruplex formed in the wild-type 
VEGF promoter sequence in K+ solution (29). It was suggested
by DMS footprinting that a minor conformation,
the 1:2:3 G-quadruplex (Supplementary Figure S4A),
could also be formed (29). The 1:2:3 G-quadruplex
needs G12 and G13 in the core G-tetrads and can be
isolated using the Pu22-T15T16 sequence (Supplementary
Figure S4A). Although the 15N-edited HMQC experiments
of the wild-type sequence VEGF-Pu22 in K+ did
not detect the formation of the 1:2:3 G-quadruplex, as the
signals for the imino protons of G12 and G13 were either
very weak or missing (Figure 1C), Pu22-T15T16 can
form a single G-quadruplex in K+ (Supplementary
Figure S4B). The 1D 1H spectrum of the wild-type
sequence appears to show a minor species, likely to be
the 1:2:3 loop isomer (Supplementary Figure S4B). It is
possible that the HMQC experiment on the 6% 15N-G-labeled
DNA is not sensitive enough to detect the low
population of the 1:2:3 loop isomer. The melting temperature
of the 1:4:1 G-quadruplex formed in Pu22-T12T13 is
77.3°C, whereas the melting temperature of the 1:2:3
G-quadruplex is 73°C (Table 1). The melting temperature
of the wild-type VEGF-Pu22 is 77.9°C (Table 1), which is
close to that of the major conformation 1:4:1
G-quadruplex. The 4°C difference in Tm may explain the
major formation of the 1:4:1 G-quadruplex in the VEGF
promoter sequence.

Table 1. Melting temperature (Tm) values for various VEGF 22-mer
DNA sequences (a)

DNA            Loop isomer    Tm (°C)
VEGF-Pu22      1:4:1          77.9
Pu22-T13       1:4:1          77.1
Pu22-T12T13    1:4:1          77.3
Pu22-T15       1:2:3          73.4
Pu22-T15T16    1:2:3          73

(a) 10 mM Tris buffer (pH 7.2), 50 mM potassium chloride, heating rate of
2°C/min.
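
The loop isomers available to VEGF-Pu22 follow from simple combinatorics on the G-run boundaries, sketched below. The enumeration assumes three-tetrad structures in which each strand uses three consecutive guanines from one run; it lists what is sterically conceivable and says nothing about stability. The function name is hypothetical.

```python
from itertools import product

# G-run boundaries of VEGF-Pu22 (1-based, inclusive), as given in the text.
G_RUNS = [(2, 5), (7, 9), (12, 16), (18, 21)]
N_TETRADS = 3  # three stacked tetrads need three consecutive Gs per strand


def loop_isomers(runs, n_tetrads=N_TETRADS):
    """Map each loop-size triple to the first tetrad guanine of each strand."""
    # Every allowed starting position of a 3-G strand within each run.
    choices = [range(start, end - n_tetrads + 2) for start, end in runs]
    isomers = {}
    for starts in product(*choices):
        # Loop length = residues between the end of one strand and the
        # start of the next.
        loops = tuple(
            starts[i + 1] - (starts[i] + n_tetrads)
            for i in range(len(starts) - 1)
        )
        isomers.setdefault(loops, []).append(starts)
    return isomers


isomers = loop_isomers(G_RUNS)
for loops in sorted(isomers):
    print(":".join(str(n) for n in loops), isomers[loops])
```

Among the combinations produced, 1:4:1 corresponds to strands starting at G3, G7, G14 and G18, and 1:2:3 to strands starting at G3, G7, G12 and G18, matching the major and minor isomers discussed in the text.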

Complete NMR spectra assignment of the major VEGF 
promoter G-quadruplex 

NMR experiments of Pu22-T12T13 were carried out in
95 mM K+ solution. We have also examined this
sequence at the physiologically relevant 140 mM K+ concentration,
which gave rise to the same NMR spectrum
(Supplementary Figure S5). The guanine imino and H8
protons of Pu22-T12T13 were assigned using 15N-edited
HMQC (Figure 2) (36,37). The absence of imino protons
for G2 and G21 (Figure 2A) indicated that G2 and G21
were not involved in the G-tetrad formation.
Notably, the imino protons of G14, G15 and G16
of Pu22-T12T13 (Figure 1B) are almost at the same locations
as those of the wild-type VEGF-Pu22 (Figure 1C).
The G-quadruplex formed in Pu22-T12T13 appears to be
monomeric, as shown by the NMR stoichiometry
titration experiment at the melting temperature
(Supplementary Figure S6). Pu22-T12T13 forms a
parallel-stranded intramolecular G-quadruplex with a 1:4:1
loop-size arrangement (Figure 1D). This folding topology
was determined by NOE connectivities of guanine imino
and H8 protons. In a G-tetrad plane with a Hoogsteen
H-bond network, the NH1 of a guanine is in close spatial
vicinity to the NH1s of the adjacent guanines and to the
H8 of one of the adjacent guanines (Figure 1E). For
example, the NOEs of G18H8/G14H1, G14H8/G7H1,
G7H8/G3H1 and G3H8/G18H1 (Figure 3A) defined the
tetrad plane of G3-G7-G14-G18. The other two tetrad
planes, G4-G8-G15-G19 and G5-G9-G16-G20, were
determined in a similar way.
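
The way four cyclic H8/H1 NOEs define a tetrad plane can be expressed as a small graph walk. The sketch below uses only the four NOEs quoted above for the G3-G7-G14-G18 tetrad; the function name and data layout are illustrative assumptions.

```python
# Intra-tetrad NOEs quoted in the text, written as (H8 residue, H1 residue).
NOES = [("G18", "G14"), ("G14", "G7"), ("G7", "G3"), ("G3", "G18")]


def tetrad_cycle(noes, start):
    """Follow H8 -> H1 contacts from `start`.

    Returns the list of residues visited if the walk closes back on
    `start`, or None if the chain of contacts is broken or closes
    elsewhere.
    """
    next_residue = dict(noes)
    cycle = [start]
    current = next_residue.get(start)
    while current is not None and current != start:
        if current in cycle:  # closed on a residue other than the start
            return None
        cycle.append(current)
        current = next_residue.get(current)
    return cycle if current == start else None


cycle = tetrad_cycle(NOES, "G3")
is_tetrad = cycle is not None and len(cycle) == 4
print(cycle, is_tetrad)
```

A closed walk of exactly four guanines is the signature of one tetrad plane; a missing NOE breaks the walk, and the function then returns None.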

Complete proton assignment of Pu22-T12T13 was 
accomplished by sequential assignment (Figure 3) using 
2D COSY, TOCSY and NOESY at different temperatures 
in both H2O and D2O (35-37). The chemical shifts of all
proton resonances are listed in Table 2. All of the residues
appear to adopt anti conformation based on the intensities
of intra-residue H8-H1' cross peaks (Figure 3B). Critical
inter-residue NOE interactions are summarized in 
Figure 4 and define the overall structure of the VEGF 
promoter G-quadruplex. 

Complete spectral assignment (Supplementary
Figure S7) was also accomplished for Pu22-T12T13A2A21,
with additional G-to-A mutations at G2
and G21.

Figure 2. Imino (A) and aromatic (B) proton assignments of Pu22-T12T13
using 1D 15N-edited HMQC experiments on site-specific
labeled oligonucleotides. Conditions: 25 mM K-phosphate, 70 mM
KCl (pH 7.0), 25°C.

NOE interactions define the overall structure of the 
VEGF G-quadruplex and show specific interactions 
between the 4-nt middle loop and flanking sequences 

The guanines on each of the four G-strands are well
stacked, as indicated by the clear NOE connections of
adjacent guanine H8 protons, such as G3H8/G4H8,
G14H8/G15H8 and G19H8/G20H8 (Figure 4). The sequential
NOE connectivities along each G-strand are
clearly observed for (n)GH8 and (n-1)GH1'/H2'/H2''/H3',
typical for right-handed DNA backbone conformation
(Figures 3B and 4). Inter-tetrad NOE connectivities
of non-sequential guanines of G-strands, such as G3H8/G19H1
and G4H8/G20H1, G7H8/G4H1 and G8H8/G5H1,
G14H8/G8H1 and G15H8/G9H1, and G19H8/G16H1,
were clearly observed (Figure 3A), supporting
both the folding structure and the right-handed twist of
the G-strands. The clear NOE cross-peaks between sugar
H1' and (n+1) H4' or H5'/H5'', e.g. G3H1'/G4H5'/H5'',
G4H1'/G5H5'/H5'', G8H1'/G9H4', H5'/H5'', G15H1'/G16H5'/H5'' and
G19H1'/G20H5'/H5'', indicated that the sugar backbones of
the G-strands are more compact than regular B-DNA
(35,37).

Figure 3. (A) The H8-H1 region of the 2D-NOESY spectrum of Pu22-T12T13
in H2O at 25°C. Intra-tetrad NOEs are in red, inter-tetrad
NOEs are in blue, and NOEs with flanking bases are in green. (B) The
H1'-H8 region with sequential assignment pathway. Missing
connectivities are labeled with asterisks. The cytosine H5-H6 NOEs
are labeled with '+'.

The sequential NOE cross-peaks are absent or weak at
the three double-chain-reversal loops, i.e. C6, C10-C11-T12-T13
and C17 (Figure 3B). The two 1-nt loop cytosines
(C6 and C17) show similar chemical shifts, which are both
downfield-shifted due to the groove location.
Unexpectedly, T13 of the 4-nt loop appears to stack well
with the 5'-tetrad: in addition to sequential NOEs at the
T13-G14 step, such as G14H8/T13H6, G14H8/T13H1',
G14H8/T13H2'/H2'' and G14H8/T13H3' (Figure 4), a clear
NOE is observed between T13H6/G7H1 (Figure 3A),
indicating that T13 stacks well with G14 towards the G7
side.

Table 2. Proton chemical shifts for Pu22-T12T13 at 25°C (a)
(a) The chemical shifts are measured in 25 mM K-phosphate, 70 mM KCl
(pH 7.0) referenced to DSS.
(b) Chemical shift measured at 2°C.

Sequential NOEs for stacking interactions are not
observed for the other three residues of the 4-nt loop,
and their clearly downfield-shifted chemical shifts are suggestive
of their groove location. A number of NOEs are
observed for the two flanking sequences, both of which
appear to adopt well-stacked conformations. For the
5'-flanking C1-G2 segment, sequential NOEs are
observed at the G2-G3 step, such as G3H8/G2H8,
G3H8/G2H1', H2'/H2'' and H3', as well as at the C1-G2
step (Figures 3B and 4). Surprisingly, the NOE between
G2H8/G18H1 was strong, indicating that G2 stacks completely
above the 5'-tetrad with its H8 end positioned right
above G18H1. Similar sequential NOEs are observed for
the 3'-flanking G21-T22 segment, i.e. at G21-G20 and
T22-G21 steps (Figures 3B and 4). A clear NOE
observed between G21H8/G20H1 (Figure 3A) indicates
that G21 stacks well with G20.

NOE-refined solution structure of the VEGF 
G-quadruplex shows unique capping structure involving 
both the 4-nt middle loop and the two flanking segments 

Solution structures of the Pu22-T12T13 G-quadruplex
were calculated using an NOE-restrained distance
geometry (DGSA) and restrained molecular dynamics
(RMD) approach (Figure 5, PDB ID 2m27), starting
from an arbitrary extended single-stranded DNA. A
total of 407 NOE distance restraints, including 145 inter-residue
NOE interactions, were used in the NOE-restrained
structure calculation (Supplementary Table S1).
Dihedral restraints were used for the anti glycosidic torsion
angle (χ) for loop residues. The stereo view of the 10
lowest energy structures is shown in Figure 5A. The
structure statistics are listed in Supplementary Table S1.

Figure 4. Schematic diagram of inter-residue NOE connectivities of
Pu22-T12T13.

Pu22-T12T13 forms a well-defined parallel-stranded
G-quadruplex structure with three tetrads. The two 1-nt
loops are located in the groove and adopt similar conformations,
with an extended sugar backbone and the
cytosine base sticking out to the solvent (Figure 5B).
The 4-nt double-chain-reversal loop, C10-C11-T12-T13,
interestingly, adopts a unique conformation (Figure 5B).
The T13 base stacks over the G14 base and appears to be
hydrogen-bonded with the G2 base of the 5'-flanking
segment (Figure 5B-iii). The hydrogen-bond interaction
was supported by NMR, i.e. the G2 imino proton was
detected at 2°C at ~10.8 ppm (Supplementary Figure S2).
The G2:T13 base pair appears to completely stack over
the 5' G-tetrad (Figure 5B-iii) and thus would experience
a strong ring-current effect. This is shown by the NMR
data, i.e. a clear upfield shift of the chemical shifts
for sugar protons of G2 and T13, e.g. G2H1' (Figure
3B, Table 2). The other three residues, C10, C11 and
T12, are located in the groove to connect the now four-layer
structure (three G-tetrads plus one G:T base pair),
with the C10 and C11 bases pointing out to the solvent. The
T13, which is involved in the G2:T13 base pair capping
structure, is a mutation from the wild-type G13. To
examine the G-quadruplex formed in the wild-type
sequence VEGF-Pu22, we took the G-quadruplex structure
formed in Pu22-T12T13 and replaced T12 and T13
with the wild-type G12 and G13 residues. We carried out
energy minimization followed by unrestrained molecular
dynamics simulation for 25 ps at 300 K. Notably, a
hydrogen-bonded G2:G13 base pair can be readily
formed in the wild-type sequence to cap the VEGF
G-quadruplex (Figure 5C). We have collected
2D NOESY data with a 50 ms mixing time for the wild-type
sequence VEGF-Pu22. Similar to what was observed
in the Pu22-T12T13 sequence, no syn conformation was
observed for any nucleotide in the VEGF-Pu22 sequence
(Supplementary Figure S8).

Figure 5. (A) Stereo view of 10 lowest energy structures of the Pu22-T12T13
G-quadruplex by NOE-restrained structure calculation. (B) A
representative structure of the NMR-refined Pu22-T12T13
G-quadruplex in two different views (i, ii); and the 5'-end view of the
capping structure (magenta) that involves the 4-nt middle loop and
5'-flanking segment (iii). (C) The molecular model of the wild-type
VEGF-Pu22 G-quadruplex by unrestrained molecular dynamics simulation
(right). The 5'-end view of the capping structure (magenta) is
also shown (left).



DISCUSSION 

The NMR results in the present study unequivocally 
demonstrated that the major intramolecular G-quadruplex
formed in the VEGF proximal promoter in
K+ solution is a parallel-stranded structure with a 1:4:1
loop-size arrangement. The minor species, a 1:2:3 loop
isomer (Supplementary Figure S4), could not be detected
in the wild-type sequence VEGF-Pu22 by 15N-edited HMQC NMR, as the
imino proton of G13, which is required for the core tetrad
of the 1:2:3 loop isomer (Supplementary Figure S4), was
not detected, even at 2°C (Figure 1C and Supplementary
Figure S2). The Tm of the 1:2:3 loop isomer was shown to
be 4°C lower than that of the 1:4:1 loop isomer, which
may explain the major formation of the 1:4:1
G-quadruplex in the VEGF promoter sequence.

Figure 6. Parallel-stranded G-quadruplex-forming promoter sequences.

Parallel-stranded structures have been found to be 
common in the human promoter G-quadruplexes, such 
as c-MYC (35,44,45), HIF-1α (46), C-KIT21 (47), RET
(48) and hTERT (49,50). Importantly, all of these 
parallel-stranded promoter G-quadruplexes contain three 
tetrads and two 1-nt loops (first and third), but a variable- 
length middle loop (Figure 6) (26,32). We have previously 
determined the molecular structure of the major 
G-quadruplex formed in the c-MYC promoter, a three- 
tetrad parallel structure with 1:2:1 loop-size arrangement 
(35), which shows that the 1-nt loop is highly favored in 
parallel-stranded G-quadruplexes because of the right- 
handed twist of the adjacent G-strands. Although the 
VEGF G-quadruplex also contains the 1-nt first and 
third loops, the middle loop of the VEGF G-quadruplex 
is 4 nt long. Significantly, unlike the 2-nt middle loop of
the MYC G-quadruplex that stays in the groove, the 4-nt 
middle loop of the VEGF G-quadruplex stretches over the 
5' tetrad to form a unique capping structure with the 
flanking segment. This capping structure was observed 
in the Pu22-T12T13 sequence with two G-to-T mutations 
at the 12 and 13 positions; a similar capping structure was 
also shown to form in the wild-type sequence VEGF-Pu22 
using unrestrained molecular dynamics simulation. It is 
noted that, although the two capping structures in the 
wild-type and mutant sequences are similar, there appear 
to be differences in their respective conformations. For 
example, the G13:G2 capping structure is larger than
the T13:G2 capping structure formed in the
mutant sequence and would thus cover more of the top
G-tetrad. In addition, the groove-located wild-type G12
residue is also likely to exert a stronger ring-current effect
on G7 than the mutated T12 (Figure 5), which
could explain the observed upfield shift of the resonance
of the G7 imino proton in VEGF-Pu22 as compared
with Pu22-T12T13 (Figure 1 and Supplementary Figure
S4). As such, the 4-nt middle loop of the VEGF G-quadruplex
appears to play a critical role in forming the
specific capping structure and stabilizing the most favored 
folding structure. This capping structure represents a 
unique, VEGF sequence-specific loop interaction and
distinguishes the VEGF G-quadruplex from other 
parallel-stranded structures, such as the MYC 
G-quadruplex whose capping structures are formed 
solely by the flanking segments due to the short 2-nt 
middle loop (35). The specific capping structure of the 
VEGF promoter G-quadruplex may be recognized by
small molecule or protein ligands, and the molecular
structure described in this study could provide a starting 
point for structure-based rational design of quadruplex- 
interactive small molecules targeting VEGF. In conclusion,
although parallel structures are common among
promoter G-quadruplexes, our study indicates that each
G-quadruplex is likely to adopt unique capping structures
through its specific variable middle loop and flanking segments,
which together determine the overall structure and specific 
interactions with small molecules or proteins.